Abstract. Although cuckoo hashing has significant applications in both
theoretical and practical settings, a relevant downside is that it requires 
lookups to multiple locations. In many settings, where lookups are ex- 
pensive, cuckoo hashing becomes a less compelling alternative. One such 
standard setting is when memory is arranged in large pages, and a ma- 
jor cost is the number of page accesses. We propose the study of cuckoo 
hashing with pages, advocating approaches where each key has several 
possible locations, or cells, on a single page, and additional choices on 
a second backup page. We show experimentally that with k cell choices 
on one page and a single backup cell choice, one can achieve nearly the 
same loads as when each key has k + 1 random cells to choose from, with 
most lookups requiring just one page access, even when keys are placed 
online using a simple algorithm. While our results are currently experi- 
mental, they suggest several interesting new open theoretical questions 
for cuckoo hashing with pages. 



1 Introduction 

Standard cuckoo hashing places keys into a hash table by providing each key 
with k cells determined by hash functions. Each cell can hold one key, and each 
key must be located in one of its cells. As new keys are inserted, keys may 
have to move from one alternative to another to make room for the new key. 
Cuckoo hashing provides high space utilization and worst-case constant-time 
lookups, making it an attractive hashing variant, with useful applications in 
both theoretical and practical settings, e.g., [2,8,10,19,20]. 

Perhaps the most significant downside of cuckoo hashing, however, is that 
it potentially requires checking multiple cells randomly distributed throughout 
the table. In many settings, such random access lookups are expensive, making 
cuckoo hashing a less compelling alternative. As a comparison, standard lin- 
ear probing works well for many settings where memory is split into (not too 
small) chunks, such as cache lines; in such settings, with suitably small loads, 
the average number of memory accesses is usually very close to 1. 



* Research supported by DFG grant DI 412/10-1. 

** Research supported by NSF grants IIS-0964473 and CCF-0915922.



In this paper, we consider cuckoo hashing under a setting where memory is 
arranged into pages, and the primary cost is the number of page accesses. In 
such a setting, a natural scheme to minimize this number might be to first hash 
each key to a page, and then keep a separate cuckoo hash table in each page. 
This limits the number of pages examined to one, and maintains the constant 
lookup time once the page is loaded. Such a scheme has been utilized in previous 
work (e.g., [2]). However, a problem with such a scheme is that the most over- 
loaded page limits the load utilization of the entire table. As we show later, the 
random fluctuations in the distribution of keys per page can significantly affect 
the maximum achievable load. 

We generalize the above approach by placing most of the cell choices asso- 
ciated with a key on the same primary page. We then allow a backup page to 
contain secondary choices of possible locations for a key (usually just one). In 
the worst case we now must access two pages, but we demonstrate experimen- 
tally that we can arrange so that for most keys we only access the primary page, 
leading to close to one page access on average. Intuitively, the secondary page
for each key allows overloaded pages to slough off load constructively to
underloaded pages, thus distributing the load. We show that we can do this effectively
offline as well as online by evaluating an algorithm that we find performs well 
even when keys are deleted as well as inserted into the table. 

We note that it is simple to show that using a pure splitting scheme, with 
no backup page, and page sizes s = m^δ, 0 < δ < 1, where m is the number
of memory cells, the load thresholds obtained are asymptotically provably the 
same as for cuckoo hashing without pages. Analysis using such a parametrization 
does not seem suitable to describe real-world page and memory sizes. While 
we conjecture that the load thresholds obtained using the backup approach, 
for reasonable parameters for memory and page sizes, match this bound, at 
this point our work is entirely experimental. We believe this work introduces 
interesting new theoretical problems for cuckoo hashing that merit further study. 

1.1 Related Work 

The issue of coping with pages for hash-based data structures is not new. An 
early reference is the work of Manber and Wu, who consider the effects of pages 
for Bloom filters [18]. Their approach is the simple splitting approach we de- 
scribed above; they first hash a key to a page, and each page then corresponds 
to a separate Bloom filter. The deviations in the number of keys hashed to a 
page yield only small increases in the overall probability of a false positive for 
the Bloom filter, making this approach effective. As we show below, such devia- 
tions have more significant effects on the acceptable load for cuckoo hash tables, 
leading to our suggested use of backup pages. More recent work includes that of 
Woelfel [21], who focuses on perfect external memory dictionaries that require 
additional space for the hash function and for handling insertions and deletions. 

Our work is perhaps superficially related to the body of literature on cuckoo 
hashing where cells (or buckets) can hold multiple keys. Here for searching the 
whole page or bucket has to be scanned. This topic was first examined by
Dietzfelbinger and Weidling [7]; other notable work in this area includes that of
Lehman and Panigrahy, who prove that "overlapping" buckets can yield im- 
proved space utilization [17]. Here we don't think of the entire page as a bucket, 
as we are thinking of pages as being sufficiently large that we may want to avoid 
searching through a page for a key. Our work can also be considered as related 
to work on using stashes, or extra locations when keys cannot be placed nor- 
mally, with cuckoo hashing [16]. Here, each key can be thought of as having an 
individualized stash corresponding to one or more cells on a separate page. 

A number of papers have recently resolved the longstanding issue regarding 
the load threshold for standard cuckoo hash tables where each key obtains k 
choices [6,11,12,13,14]. Our work re-opens the issue, as we consider the question 
of the effect of pages on these thresholds, if the pages are smaller than m^δ, such
as for example polylog(m). 

Practical motivation for this approach includes recent work on real-world 
implementations of cuckoo hashing [2,20]. In [2], where cuckoo hashing algo- 
rithms are implemented on graphical processing units, the question of how to 
maintain page-level locality for cuckoo hash tables arises. Even though work for 
lookups can be done in parallel, the overall communication bandwidth can be a 
limitation in this setting. Ross examines cuckoo hashing on modern processors, 
showing they can be quite effective by taking advantage of available parallelism 
for accessing cache lines [20]. Our approach can be seen as attempting to ex- 
tend this performance, from cache lines to pages, by minimizing the amount of 
additional parallelism required. 

1.2 Our Results 

We give a short summary of the results in the paper. (All results are experimen- 
tal.) Our presented results focus on the setting of four location choices per key. 
The maximum load factor c*_4 of keys with four hash functions and no paging
is known. With small pages and each key confined to one page, we find using
an optimal offline algorithm that the maximum achievable load factor is quite
low, well below c*_4. However, if each key is given three choices on a primary page
and a fourth on a backup page, the load factor is quite close to c*_4, even while
placing most keys in their primary page, so that most keys can be accessed with 
a single page access. With three primary choices, a single backup choice and 
filling up the table to 95 percent, we find that only about 3 percent of keys need 
to be placed on a backup page (with suitable page sizes). We show that a sim- 
ple variation of the well-known random walk insertion procedure allows nearly 
the same level of performance with online, dynamic placement of keys (including 
scenarios with alternating insertions and deletions). Our experiments consistently
yield that at most 5 percent of keys need to be placed on a backup page with
these parameters. This provides a tremendous reduction of the number of page 
accesses required for successful searches. For unsuccessful searches, spending a 
little more space for Bloom filters on each page leads to an even smaller number 
of accesses to backup pages. 

2 Problem Description



We want to place n keys into m = n/c memory (table) cells where each cell
can hold a fixed number of ℓ ≥ 1 keys. The value c is referred to as the load
factor. The memory is subdivided into t pages (sub-tables) of equal size s = m/t.
(Throughout the paper we assume m is divisible by t.) Each key is associated
with a primary page and a backup page distinct from the primary page, as well
as a set of k distinct table cells, k_p on the primary page and k_b = k - k_p on the
backup page. The pages and cells are chosen according to hash functions on the
key, and it is useful to think of them as being chosen uniformly at random in
each case. For a given assignment let n_p be the number of keys that are placed
in their primary page and let n_b be the number of keys that are placed in their
backup page. We can state the cuckoo paging problem as follows.

Problem (Cuckoo Paging). Find a placement of the n keys such that the fraction
n_p/n is maximized.

Remark 1. Note that under the standard model, with no backup pages and all
key locations assumed to be chosen uniformly at random, there is a threshold
load factor c*_{k,ℓ} such that whenever c < c*_{k,ℓ} a placement exists with probability
1 - o(1). The recent paper [11] gives the complete picture for all reasonable values
of k and ℓ.

Remark 2. As mentioned, if k_b = 0 and page sizes are s = m^δ, δ > 0, the
asymptotic threshold load factor is the same as in the setting without pages. 
This is easily proven using tight concentration bounds on the number of keys 
per page. Our interest, however, is in ranges for m that are realistic and not too 
large page sizes s, so that this asymptotic behavior is not an adequate description 
of performance. 

The aim of the paper is to experimentally investigate the potential for saving 
access cost by using primary and backup pages. Appropriate algorithms are 
presented in the next section. For ease of description of the algorithms we also 
use the following bipartite cuckoo graph model as well as the hashing model. 

2.1 Cuckoo Graph Model 

We consider random bipartite graphs G = (L ∪ R, E) with left node set L = [n]
and right node set R = [m]. The left nodes correspond to the keys, the right
nodes correspond to the memory cells of capacity ℓ. The set R is subdivided
into t segments R_0, R_1, ..., R_{t-1}, each of size s = m/t, which correspond to
the separate pages. Each left node x is incident to k = k_p + k_b edges where its
neighborhood N(x) consists of two disjoint sets N_p(x) and N_b(x) determined
according to the following scheme (all choices are fully random): choose p from
[t] (the index of the primary page); then choose k_p different right nodes from
R_p to build the set N_p(x); next choose b (the index of the backup page) from
[t] - {p}; and finally choose k_b different right nodes from R_b to build the set
N_b(x). Let e = {x, y} be an edge where x ∈ L and y ∈ R. We call e a primary
edge if y ∈ N_p(x) and call e a backup edge if y ∈ N_b(x).
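
The sampling scheme of the cuckoo graph model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the experiments: the function name `sample_neighborhood` is hypothetical, cells are numbered 0..m-1 with page j owning cells j·s to (j+1)·s - 1, and Python's `random` module stands in for fully random hash functions.

```python
import random

def sample_neighborhood(m, s, k_p, k_b, rng=random):
    """Return (N_p, N_b): the primary and backup cell lists of one key.

    Assumes cells 0..m-1, where page j owns cells j*s .. (j+1)*s - 1.
    """
    assert m % s == 0, "m must be divisible by the page size"
    t = m // s                        # number of pages
    assert t >= 2 or k_b == 0, "a backup page needs a second page"
    p = rng.randrange(t)              # index of the primary page
    n_p = [p * s + off for off in rng.sample(range(s), k_p)]
    n_b = []
    if k_b > 0:
        b = rng.randrange(t - 1)      # uniform over [t] - {p}
        if b >= p:
            b += 1
        n_b = [b * s + off for off in rng.sample(range(s), k_b)]
    return n_p, n_b
```

Calling `sample_neighborhood(10**6, 10**3, 3, 1)` would give one key three distinct cells on a random primary page and one cell on a different backup page.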

3 Algorithms



Using the cuckoo graph we can restate the problem of inserting the keys as finding 
an orientation of the edge set of the cuckoo graph G such that the indegree of 
all left nodes is exactly 1 and the outdegree of all right nodes is at most ℓ. We
call such an orientation legal. An edge e = (y, x) with x from L and y from R is
interpreted as "storing key x in memory cell y." If y is from N_p(x) we call x a
primary key and otherwise we call x a backup key. Each legal orientation which 
has a maximum number of primary keys is called optimal. 

3.1 Static Case 

In the static case, i.e., if the cuckoo graph G is given in advance, there are well- 
known efficient (but not linear time) algorithms to find an optimal orientation 
of G. One possibility is to consider a corresponding minimum cost matching 
problem: Assign costs to each edge from G where primary edges get cost 0 and
backup edges get cost 1. Then replace each node y from R with ℓ copies and
replace each edge to which y is incident with ℓ copies as well. Initially direct all
edges from left to right. Edges from right to left are matching edges. The minimum
cost matching problem is to find a left-perfect matching (legal orientation)
with minimum cost (minimum number of backup keys). The algorithm we used
to determine such a matching is a variant of the Successive Shortest Path Al- 
gorithm [1] but uses a modified Hopcroft-Karp Algorithm instead of Dijkstra's 
Algorithm for finding augmenting paths of minimal cost. 

Given a bipartite graph with 0-1 edge costs the modified Hopcroft-Karp Al- 
gorithm finds a left-maximum matching of minimum cost as follows. Initially let 
γ = 0. The algorithm works in rounds. In each round we try to find node-disjoint
augmenting paths (directed paths with free start node from L and free end
node from R) of cost exactly γ. Consider round number i. For each augmenting
path found in round i flip the edge orientations and the edge costs along the
path, and then go to round i + 1. If in round i no such path exists but there is
an augmenting path of larger cost, increment γ by one and go to round i + 1;
otherwise stop the algorithm. Augmenting paths with fixed cost γ are found via
a combination of a modified breadth first search (BFS) and depth first search
(DFS). The BFS starts from the set L_0 of all left nodes with in-degree zero
(unmatched nodes) at the beginning of a round. The search partitions the nodes
into layers. For each explored node the layer and the cost of the path to this
node are stored. A node can be explored twice if it is reached by a path of lesser
cost. The BFS stops at the first level where one or more free nodes of R are
reached by a path of cost exactly γ. Let R_0 be the set of these right nodes. The
algorithm tries to find node-disjoint paths between R_0 and L_0 via DFS, where the
search can only follow edges between two successive layers. During the recursive
descent the costs of the path are accumulated. If the DFS reaches a free node and
the costs are exactly γ, the recursive ascent removes the node labels and flips the
edge costs and orientations along the path. The algorithm is optimal in the sense
that it finds a left-maximum matching of minimum cost since we have only integer
weights and the costs of the minimum cost augmenting paths are monotonically
non-decreasing.

Algorithm 1: ModifiedHopcroftKarp(bipartite graph G)

γ ← 0
while an augmenting path exists do
    L_0 ← {x | x ∈ L, x unmatched}
    R_0 ← BFS(γ)
    atLeastOnePathFound ← false
    foreach y ∈ R_0 do
        if DFS(y, 0, γ) then atLeastOnePathFound ← true
    if not atLeastOnePathFound then γ ← γ + 1

BFS(max_cost γ):
  - partitions the nodes into layers, starting from L_0
  - stops at the first layer l with a path to a right node with cost equal to γ
  - returns the set of free nodes at layer l with path cost equal to γ

DFS(node y, current_cost γ', max_cost γ):
  - recursive descent through the layers given by BFS
  - the current costs of the path are accumulated in γ'
  - if a free node is reached and γ' = γ then the recursive ascent removes the node
    labels, flips the edge costs and orientations along the path and returns true;
    otherwise returns false
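
The reduction to minimum cost matching can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the implementation used for the experiments: it adds keys one at a time and augments along a cheapest alternating path found by a queue-based Bellman-Ford search, instead of using the modified Hopcroft-Karp algorithm, and it models the capacity ℓ by counting keys per cell instead of copying nodes. The names `optimal_placement`, `edges`, `place` and `stored` are hypothetical.

```python
from collections import deque

def optimal_placement(edges, m, ell=1):
    """Place every key while minimising the number of backup keys.

    edges[x] is a list of (cell, cost) pairs for key x, with cost 0 for a
    primary edge and 1 for a backup edge. Returns (place, n_backup), or
    None if no legal orientation exists.
    """
    cost = [dict(e) for e in edges]        # cost[x][y] of edge {x, y}
    place = [None] * len(edges)            # place[x] = cell storing key x
    stored = [set() for _ in range(m)]     # stored[y] = keys held by cell y
    for start in range(len(edges)):
        src = ("key", start)
        dist, parent, queue = {src: 0}, {}, deque([src])
        while queue:                       # label-correcting shortest paths
            kind, v = node = queue.popleft()
            if kind == "key":              # key -> cell it does not occupy
                steps = [(("cell", y), c) for y, c in cost[v].items()
                         if place[v] != y]
            else:                          # cell -> stored key (undo edge)
                steps = [(("key", x), -cost[x][v]) for x in stored[v]]
            for nxt, c in steps:
                if dist[node] + c < dist.get(nxt, float("inf")):
                    dist[nxt], parent[nxt] = dist[node] + c, node
                    queue.append(nxt)
        free = [(dist[("cell", y)], y) for y in range(m)
                if ("cell", y) in dist and len(stored[y]) < ell]
        if not free:
            return None                    # key `start` cannot be placed
        node = ("cell", min(free)[1])
        while node != src:                 # flip edges along the path
            cell, key = node[1], parent[node][1]
            old = place[key]
            if old is not None:
                stored[old].discard(key)
            place[key] = cell
            stored[cell].add(key)
            node = src if key == start else ("cell", old)
    return place, sum(cost[x][place[x]] for x in range(len(edges)))
```

The sketch is meant for small instances only. Its running time is far from that of the round-based algorithm described above, which finds many node-disjoint augmenting paths per round.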



3.2 Dynamic Case 

In the online scenario the cuckoo graph initially consists only of the right nodes. 
To begin let us consider the case of insertions only. The keys arrive and are 
inserted one by one, and with each new key the graph grows by one left node 
and k edges. To find an appropriate orientation of the edges in each insertion 
step, we use a random walk algorithm, which is a modification of the common 
random walk for k-ary cuckoo hashing [10] but with two additional constraints:

1. avoid creating backup keys at the beginning of the insertion process, and 

2. keep the number of backup keys below a small fixed fraction. 

For the description of the algorithm we use a dual approach. The pseudocode 
(Algorithm 2) refers to the graph model and the following explanation uses the 
hashing model. We refer to a key's k_p cells on its primary page as primary
positions, and the cells on its backup page as backup positions. The insertion 
of an arbitrary key x takes one or more basic steps of the random walk, which 
can be separated into the following sub-steps. 

Let x be the key that is currently "nestless", i.e., x is not stored in the memory. 
First check if one of its primary positions is free. If this is the case store x in such 
a free cell and stop successfully. Otherwise toss a biased coin to decide whether 
the insertion of x should proceed on its primary page or on its backup page.
— If the insertion of x is restricted to the primary page, randomly choose one 
of its primary positions y. Let x' be the key which is stored in cell y. Store 
x in y, replace x with x', and start the next step of the random walk. 

Algorithm 2: RandomWalkInsert(node x)

success ← false
while globalCounter > 0 and not success do
    if ∃ y ∈ N_p(x) with outdeg(y) < ℓ then
        flip edge (x, y); success ← true
    if not success then
        if randomNumber() < α then
            choose random y ∈ N_p(x); choose random x' ∈ N(y)
            flip edge (x, y); flip edge (y, x'); x ← x'
        else
            if ∃ y ∈ N_b(x) with outdeg(y) < ℓ then
                flip edge (x, y); success ← true
            else
                choose random y ∈ N_b(x); choose random x' ∈ N(y)
                flip edge (x, y); flip edge (y, x'); x ← x'
    globalCounter ← globalCounter - 1
return success

(* The modification to avoid unnecessary back steps is not shown for the sake of clarity. *)



— If x is to be stored on its backup page, first check if one of the backup 
positions of x is free. If this is the case store x in such a free cell and stop 
successfully. Otherwise randomly choose one of the backup positions y on 
this page and proceed as in the previous case. 

The matching procedure is slightly modified to avoid unnecessary back steps. 
That is, if a key x displaces a key x' and in the next step x' displaces x" then 
x" = x is forbidden as long as x' has another option on this page. 
The algorithm uses two parameters:
α - the bias of the virtual coin. This influences the fraction of backup keys.
b - controls the terminating condition. A global counter is initialized with
value b · n, which is the maximum number of total steps of the random walk
summed over all keys. For each basic step the global counter is decremented
by one. If the limit is exceeded the algorithm stops with "failure".
Deletions are carried out in a straightforward fashion. To remove a key x, 
first the primary page is checked for x in its possible cells, and if needed the 
backup page can then be checked as well. The cell containing x is marked as 
empty, which can be interpreted as removing the left node x and its k incident 
edges from G. The global counter is ignored in this setting (b = ∞).
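
In the hashing model, the insertion procedure described above can be sketched as follows for cells of capacity one. This is a simplified illustration, not the code used in the experiments: the names `random_walk_insert`, `table`, `choices` and `counter` are hypothetical, the modification that avoids unnecessary back steps is omitted, and every key is assumed to have at least one backup position unless the coin bias `alpha` equals 1.

```python
import random

def random_walk_insert(x, table, choices, alpha, counter, rng=random):
    """Insert key x by a random walk; return True on success.

    table[y] is the key stored in cell y or None. choices[x] is a pair
    (primary_cells, backup_cells). counter is a one-element list holding
    the remaining global step budget, shared by all insertions.
    """
    while counter[0] > 0:
        counter[0] -= 1
        primary, backup = choices[x]
        free = [y for y in primary if table[y] is None]
        if free:                              # a primary position is free
            table[free[0]] = x
            return True
        if rng.random() < alpha:
            y = rng.choice(primary)           # evict on the primary page
        else:
            free = [y for y in backup if table[y] is None]
            if free:                          # a backup position is free
                table[free[0]] = x
                return True
            y = rng.choice(backup)            # evict on the backup page
        x, table[y] = table[y], x             # displaced key is now nestless
    return False                              # step budget exhausted
```

Passing the budget as a shared one-element list mirrors the global counter initialized with b · n. When the function returns False, the key that is nestless at that moment is not stored anywhere, so a caller would have to treat the whole run as a failure.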

4 Experiments 

For each of the following experiments we consider cuckoo graphs G randomly 
generated according to some configuration κ = (c, m, s, k_p, k_b) where c is the
quotient of left nodes (keys) and right nodes (table cells), m is the total number
of right nodes, s is the page size, and k_p, k_b are the number of primary and
backup edges of each left node. In the implementation the left and right nodes 
were simply the number sets [n] and [m]. All random choices were made via 
the pseudo random number generator MT19937 "Mersenne Twister" of the GNU 
Scientific Library [15]. 

If not stated otherwise the total number of cells is m = 10^6 and pages are
of size s = 10^i, i < 6. Our main focus is on situations where ℓ = 1, i.e., each
cell can hold one key. Moreover we restrict ourselves to the cases k_p = 3, k_b = 1
and (just for comparison) k_p = 4 and k_b = 0. While we have done experiments
with other parameter values, we believe these settings portray the main points. 
Also, while we have computed sample variances, in many cases they are small; 
this should be assumed when they are not discussed. 

4.1 Static Case 

Experimental results for the static case determine the limits of our approach and 
serve as a basis of comparison for the dynamic case. 

Setup and Measurements. First of all we want to see the limits of cuckoo 
hashing with pages if there are no backup options at all. Note that for fixed page 
size s and larger and larger table size m the fraction of keys that can be placed 
decreases. For n = c · m keys the load of each page is approximately Poisson
distributed with parameter c · s; asymptotically the success probability can be
estimated as

\[ O\Bigl(\bigl(\Pr(\mathrm{Po}(c \cdot s) \le s)\bigr)^{t}\Bigr) = O\Bigl(\Bigl(\sum_{i=0}^{s} \frac{(c \cdot s)^{i}}{i!} \cdot e^{-c \cdot s}\Bigr)^{t}\Bigr), \tag{1} \]

for t = m/s, which approaches 0 for m → ∞.
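
To get a feeling for how quickly this estimate decays, it can be evaluated numerically. The sketch below is an illustration only: it reads the estimate as the probability that none of the t pages receives more than s keys under independent Poisson loads, which is an assumption about the intended condition, and the function names `poisson_cdf` and `no_backup_estimate` are hypothetical.

```python
import math

def poisson_cdf(mu, s):
    """Pr(Po(mu) <= s), with each term computed in log space."""
    return sum(math.exp(i * math.log(mu) - mu - math.lgamma(i + 1))
               for i in range(s + 1))

def no_backup_estimate(c, s, m):
    """Probability that no page gets more than s keys (assumed reading)."""
    t = m // s                       # number of pages
    return poisson_cdf(c * s, s) ** t
```

For a fixed page size s the per-page probability is a constant below 1, so raising it to the power t = m/s drives the estimate towards 0 as m grows.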

For the case with backup options we try to get an approximation for possible 
threshold densities. Let c^-_m and c^+_m be the loads n/m that identify the
transition from where there is a feasible orientation and where there is no feasible
orientation of G without and with the backup option respectively. To get
approximations for c^-_m and c^+_m we study different ranges of load factors [c_start, c_end].
Specifically, for all c where c = c_start + i · 10^-4 ≤ c_end, and i = 0, 1, 2, ..., we
construct a number a of random graphs and measure the failure rate λ at c. We fit
the sigmoid function

\[ f(c; x, y) = \bigl(1 + \exp(-(c - x)/y)\bigr)^{-1} \tag{2} \]

to the data points (c, λ) using the method of least squares. The parameter x
(inflection point) is an approximation of c^-_m and c^+_m respectively. With Σ_res
we denote the sum of squares of the residuals.
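
The fitting step can be illustrated with a small self-contained Python sketch. It is not the fitting code used for the experiments: instead of a proper nonlinear least-squares solver it minimises the sum of squared residuals by repeated grid refinement, and the names `sigmoid` and `fit_sigmoid` as well as the search ranges passed in are assumptions.

```python
import math

def sigmoid(c, x, y):
    """Fitting function f(c; x, y) = 1 / (1 + exp(-(c - x) / y))."""
    z = -(c - x) / y
    if z > 700:                      # exp would overflow; f is ~0 here
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def fit_sigmoid(points, x_range, y_range, steps=40, rounds=6):
    """Least-squares fit of (x, y) to points [(c, failure_rate), ...].

    Returns (sum of squared residuals, x, y). y_range must be positive.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    best = None
    for _ in range(rounds):
        dx, dy = (x_hi - x_lo) / steps, (y_hi - y_lo) / steps
        for i in range(steps + 1):
            x = x_lo + i * dx
            for j in range(steps + 1):
                y = y_lo + j * dy
                ssr = sum((sigmoid(c, x, y) - lam) ** 2 for c, lam in points)
                if best is None or ssr < best[0]:
                    best = (ssr, x, y)
        _, bx, by = best
        # shrink the search window around the best grid point
        x_lo, x_hi = bx - 3 * dx, bx + 3 * dx
        y_lo, y_hi = max(by - 3 * dy, 1e-12), by + 3 * dy
    return best
```

The returned x plays the role of the inflection point, that is, the load at which the fitted failure rate equals 0.5.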

Furthermore, for different c and page sizes s, we are interested in the maximum
ratio r_p = n_p/n or load a_p = n_p/m of primary keys, respectively.

For a fixed page p let w be the number of keys that have primary page p but 
are inserted on their backup page. Since the number of potential primary keys 
for a page follows a binomial distribution, some pages will be lightly loaded and
therefore have a small value of w or even w = 0. Some pages will be overloaded 
and have to shed load, yielding a large value of w. We want to study the relative 
frequency of the values w. 

Results. Here we consider results from an optimal placement algorithm. 

I. Table 1 gives approximations of the loads where cuckoo hashing with paging
and k = 4 hash functions has failure rate λ = 0.5 in the case of 1 or 0 backup
pages. With no backup pages the number of keys that can be stored decreases
with decreasing page size and the success probability around c^-_m converges less
rapidly, as demonstrated clearly in Fig. 1. This effect becomes stronger as the
pages get smaller. For this reason the range of load factors [c_start, c_end] of
sub-table (a) grows with decreasing page size. Using only one backup edge per key
almost eliminates this effect. In this case the values c^+_m seem to be stable for
varying s and are very near to the theoretical threshold of standard 4-ary cuckoo
hashing, which is c*_4 ≈ 0.976770; only in the case of very small pages s = 10 can
a minor shift of c^+_m be observed. The position of c^+_m as well as the slope of the
fitting function appear to be quite stable for all considered page sizes.

Table 1. Approximations of the load factors that are the midpoints of the transition
from failure rate 0 to failure rate 1 (without and with backup option) via fitting function
(2) to a series of data points. For each data point the failure rate among 100 random
graphs was measured. The grey rows correspond to the plots of Fig. 1.



II. The average of the maximum fraction of primary keys, allowing one backup
option, is shown in Table 2. The fraction decreases with increasing load factor 
c and decreases with decreasing page size s as well. Interestingly, for several 
parameters, we found that an optimal algorithm finds placements with more 
than c*_3 · m keys sitting in one of their 3 primary positions, where c*_3 ≈ 0.917935
is the threshold for standard 3-ary cuckoo-hashing. That is, more keys obtain one 
of their primary three choices with three primary and one backup choice than 
what could be reached using just three primary choices even without paging. 

Figure 2 depicts the relative frequency of the values w among 10^5 pages
for selected parameters (c, s) = (0.95, 10^3). In this case about 17 percent of all



[Figure 1: failure rate λ plotted against the load factor c; legend: measured data, fitted function (2), Σ_res = 0.00717218, x = 0.976611.]

(a) s = 10^2, 641 data points, k_p = 4, k_b = 0    (b) s = 10^2, 41 data points, k_p = 3, k_b = 1
Fig. 1. Point of transition (a) without and (b) with backup pages.

Table 2. Average (among 100 random graphs) of the fraction of keys that can be
placed on their primary page for different page sizes s and k_p = 3, k_b = 1. The failure
rate is λ = 0. For c > 0.98 the random graph did not admit a solution anymore. The
entries of the grey cells are larger than c*_3.



pages do not need backup pages, i.e., w = 0. This is consistent with the idea
that pages with a load below c*_3 · s will generally not need backup pages. The
mean w is about 2.5 percent of the page size s and for about 87.6 percent of the
pages the value w is at most 5 percent of the page size. The relative frequency
of w being greater than 0.1s is very small, about 1.1 · 10^-3.

Summary. We observed that using pages with (k_p, k_b) = (3, 1) we achieve loads
very close to the c*_4 threshold (c^+_m ≈ c*_4). Moreover the load a_p from keys placed
on their primary page is quite large, near or even above c*_3.

Fig. 2. The relative frequency of w = 0 is 0.169; (c, s) = (0.95, 10^3), a = 10^3, λ = 0

Let X be the average (over all keys that have been inserted) number of
page requests needed in a search for a key x, where naturally we first check
the primary page. If (k_p, k_b) = (3, 1) and a key was equally likely to be in
any of its locations, the expected number of page requests E(X) would satisfy
E(X) = 1.25. If (k_p, k_b) = (3, 1) and c is near c*_4 then we have roughly E(X) ≈
c*_3/c · 1 + (1 - c*_3/c) · 2. For example, for (c, s) = (0.95, 10^3), using the values of
Table 2 we find E(X) ≈ 0.974 · 1 + 0.026 · 2 < 1.03.
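
The arithmetic behind these numbers is a two-point expectation: a primary key costs one page request, a backup key costs two. A minimal Python sketch, with the hypothetical helper name `expected_page_requests`, reproduces the figures quoted above.

```python
def expected_page_requests(primary_fraction):
    """E(X) for successful searches when the primary page is checked first."""
    return primary_fraction * 1 + (1 - primary_fraction) * 2

# Three of four locations are primary, all locations equally likely.
uniform_case = expected_page_requests(3 / 4)
# Fraction of primary keys quoted in the text for (c, s) = (0.95, 10^3).
measured_case = expected_page_requests(0.974)
```

Here `uniform_case` equals 1.25 and `measured_case` stays below 1.03, matching the values in the paragraph above.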

Now assume we perform a lookup for a key x not in the table. The disadvantage
of using two pages per key is that now we always require two page requests,
i.e., E(X) = 2. This can be circumvented by storing an additional set member- 
ship data structure, such as a Bloom filter [3], for each page p representing the 
w many keys that have primary page p but are inserted on their backup page. 

One can trade off space, computation, and the false positive probability of 
the Bloom filter as desired. As an example, suppose the Bloom filters use 3 
hash functions and their size corresponds to just one bit per page cell. In this 
case, we can in fact use the same hash functions that map keys to cell locations 
for our Bloom filter. Bounding the fraction of 1 bits of a Bloom filter from
above via (k_p · w)/s, the distribution of w as in Fig. 2 leads to an average false
positive rate of less than 0.15 percent and therefore an expected number of page 
requests E(X) of less than 1.0015 for unsuccessful searches. One could reduce 
false positives even further using more hash functions, or use less space. 
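
The bound used in this estimate can be written out as a short Python sketch. It assumes, as in the example above, a filter of s bits per page and k_p hash functions, so that w keys set at most k_p · w bits and a false positive needs all k_p probed bits to be set. The function names and the toy distribution of w are hypothetical and are not the measured data of Fig. 2.

```python
def bloom_fp_bound(w, s, k_p=3):
    """Upper bound on the false positive rate of one page's Bloom filter."""
    fill = min(1.0, k_p * w / s)     # upper bound on the fraction of 1 bits
    return fill ** k_p

def expected_requests_unsuccessful(w_freq, s, k_p=3):
    """E(X) for unsuccessful searches; w_freq maps w to its relative frequency."""
    avg_fp = sum(f * bloom_fp_bound(w, s, k_p) for w, f in w_freq.items())
    return 1 + avg_fp                # the backup page is read only on a positive

# Toy distribution of w for s = 1000 (made-up numbers, for illustration only).
toy = {0: 0.2, 25: 0.6, 50: 0.2}
toy_requests = expected_requests_unsuccessful(toy, 1000)
```

Because the bound grows with the cube of w/s, the few heavily overloaded pages dominate the average false positive rate.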

4.2 Dynamic Case 

We have seen the effectiveness of optimal offline cuckoo hashing with paging. We 
now investigate whether similar placements can be found online, by considering 
the simple random walk algorithm from Sect. 3.2. We begin with the case of 
insertions only. 

Setup and Measurements. Along with the failure rate A, the fraction of 
primary keys r_p and corresponding load a_p, and the distribution of the number
of keys w inserted on their backup page, we consider two more performance 
characteristics: 

#st - the average number of steps of the random walk insertion procedure. A 
step is either storing a key x in a free cell y or replacing an already stored 
key with the current "nestless" key. 

#pr - the average number of page requests over all inserted items. Here each 
new key x requires at least one page request, and every time we move an 
item to its backup page, that requires another page request. 

We focus on characteristics of the algorithm with loads near c^+_m, varying
the number of table cells m = 10^5, 10^6, 10^7 and page sizes s = 10, 10^2, 10^3. The
performance of the algorithm heavily depends on the choice of parameters α and
b. Instead of covering the complete parameter space we first set b to infinity and
use the measurements to give insight into the performance of the algorithm for
selected values of α.

In addition we want to explore whether we can expect a sufficiently low 
failure probability of the random walk algorithm, at least for some selected sets 
of parameters π including practical values of b. For this we tested the following
null hypothesis H_0(π) = "If one uses parameter set π then Algorithm 2 fails with
probability at least p." To test the null hypothesis for a specific π we performed
the random experiment "insertion of n = c · m keys with Algorithm 2" a times.
Let A(π) be the event that all of the a many random experiments for a given π
ended successfully. Then we have:

\[ \Pr\bigl(A(\pi) \mid H_0(\pi)\bigr) \le (1 - p)^{a} \le \exp(-p \cdot a) . \tag{3} \]

For example if a = 10^6 and p = 10^-5 we have Pr(A(π) | H_0(π)) ≤ exp(-10) ≈
4.54 · 10^-5. Hence if we observe A(π) we may reject the null hypothesis with
high confidence.
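
The numbers in this example are easy to check. The following Python sketch, with the hypothetical helper name `rejection_bound`, evaluates both sides of the second inequality in (3) for given p and a.

```python
import math

def rejection_bound(p, a):
    """Return ((1 - p)^a, exp(-p * a)) for failure probability p and a trials."""
    return (1 - p) ** a, math.exp(-p * a)

exact, bound = rejection_bound(1e-5, 10**6)
```

For p = 10^-5 and a = 10^6 the exponential bound evaluates to about 4.54 · 10^-5, and the exact power (1 - p)^a lies slightly below it.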

We also study the influence of α for a fixed configuration. We vary α to see
qualitatively how the number of primary keys as well as the number of steps and 
page requests depend on this parameter. 

It is well known that hashing schemes can perform differently in settings with 
insertions and deletions rather than insertions alone, so we investigate whether 
there are substantial differences in this setting. Specifically, we consider the table 
under a constant load by alternating insertion and deletion steps. 

Results. Here we consider results from the random walk algorithm. 

I. Tables 3 and 4 show the behavior of the random walk algorithm with loads
near c^+_m for (c, α) = (0.95, 0.97) and (c, α) = (0.97, 0.90). The number of allowed
steps for the insertion of n keys is set to infinity via b = ∞. The number of trials
a per configuration is chosen such that a · m = 10^9 (keeping the running time
for each configuration approximately constant).

We first note that with these parameters the algorithm found a placement
for the keys in all experiments; failure did not occur. For fixed page size the
sample means are almost constant; for growing page size the load $\bar{r}_p$
increases, while #st and #pr decrease, with a significant drop from page size 10
to 100. For our choices of $\alpha$ the random walk insertion procedure missed the
maximum fraction of primary keys by up to 2 percent for $c = 0.95$ and by up to
6 percent for $c = 0.97$, and needs roughly the same average number of steps (for
fixed page size).

II. To get more practical values for $b$ we scaled up the values #st from
Tables 3 and 4 and estimated the failure probability for suitable parameter sets

$\pi = (c, s, \alpha, b) \in \{(0.95, 10^2, 0.97, 30), (0.95, 10^3, 0.97, 25), (0.97, 10^2, 0.90, 30), (0.97, 10^3, 0.90, 25)\}$.

For all these parameter sets we observed a failure rate of zero among $a = 10^6$
attempts (event $A(\pi)$). We can conclude at a level of significance of at least
$1 - e^{-10}$ that for these sets the failure probability of the random walk
algorithm is at most $10^{-5}$.

III. Figure 3 shows how parameter $\alpha$ influences the ratio of primary keys
$r_p$, the number of insertion steps #st and the number of page requests #pr.

Fig. 3. $(c, s, b) = (0.95, 10^3, 30)$, $a = 10^3$, $\lambda =$

The mean fraction of primary keys grows linearly and #st grows nonlinearly
with growing $\alpha$. For $\alpha = 0.98$ the gap between the optimal fraction of
primary keys and the fraction reached by the random walk procedure is about
1 percent. The value of #st also depends nonlinearly on $\alpha$ and reaches a
local minimum at $\alpha = 0.95$. The sample variances are quite small and stable
except for $s^2[\#st]$ and large $\alpha$ (near 0.98).

IV. The results for alternating insertions and deletions for parameters
$(c, s) = (0.95, 10^3)$ and $(\alpha, b) = (0.97, 30)$ are shown in Fig. 4. We
measured the current fraction of primary keys $r_p$ and the number of insertion
steps with respect to each key, $\#st_{key}$. Recall that #st is the average number
of insertion steps concerning all keys. In the first phase (insertions only) the
average number of steps per key grows very slowly at the beginning and is below
10 when reaching a load where about 1 percent of current keys are backup keys.
After that $\#st_{key}$ grows

10^5
158.618752
0.860217
0.817206
158.645056
1.092781
0.003914
0.891509
22.807328
1.081478
2.248953
22.813986
0.104012
10^2
10^5
0.938412
0.891491
0.010862
2.249201
10^4
10^2
16.580150
0.000182
10^7
10^2
0.907943
0.005534
1.893248
0.000019

Table 3. Characteristics of Algorithm 2 for $(c, \alpha, b) = (0.95, 0.97, \infty)$. $\lambda = 0$.

                0.792298  152.873602  10.759338
      0.886997  0.860387   23.320507   2.731285  5.361922
      0.886992  0.860382   23.289233   0.256942  5.355625  0.010218
      0.886985  0.860375   23.268641   0.024796  5.351518  0.000986
10^3  0.898281  0.871332   19.497032   1.550490  4.607751  0.061739
10^3  0.898232  0.871285   19.486312   0.146267  4.605481  0.005816
10^4  0.898235  0.871288   19.493215   0.012744  4.606893  0.000507

Table 4. Characteristics of Algorithm 2 for $(c, \alpha, b) = (0.97, 0.90, \infty)$. $\lambda = 0$.




Fig. 4. $(c, s, \alpha, b) = (0.95, 10^3, 0.97, 30)$, $a = 10^3$, $\lambda = 0$.
Abscissa: key number. The ordinate of the right half of the upper plot is in log
scale.



very fast up to almost $10^3$ (for the last few keys), which is the page size. The
sample mean of the average number of steps #st up to this point is about 16.6.
Similarly, the sample mean of the fraction of primary keys decreases very
slowly at the beginning and decreases faster at the end of the first phase. Up
to a load of about 82.6 percent the fraction of backup keys is below 1 percent. In
the second phase (deletions and insertions alternate) $\#st_{key}$ and #st decrease
and quickly reach a steady state. Since the decrease of $\bar{r}_p$ is marginal
but the drop in $\#st_{key}$ is significant, we may conclude that the overall
behavior is better in steady state than at the end of the insertion-only phase.
Moreover, in an extended experiment with $n = c \cdot m$ insertions and
$10 \cdot n$ delete-insert pairs the observed equilibrium remains the same and
therefore underpins the conjecture that Fig. 4 really shows a "convergence point"
for alternating deletions and insertions.

V. Figure 5 shows the relative frequency of the values $w$ among $10^5$ pages for
$(c, s) = (0.95, 10^3)$ and $(\alpha, b) = (0.97, 30)$ at the end of the
insertion-only phase, given by Fig. 5 (a), and at the end of the alternation
phase, given by Fig. 5 (b). Note that Fig. 5 (a) corresponds to Fig. 2 with
respect to the graph parameters.




Fig. 5. Frequency of $w$ for (a) the insertion only phase and (b) the insertion
and deletion phase; $(c, s) = (0.95, 10^3)$, $a = 10^3$, $\lambda =$

The shapes of the distributions differ only slightly, except that in the second 
phase the number of backup keys is larger. In comparison with the values given 
by the optimal algorithm in Fig. 2 the distribution of the w values is more skewed 
and shifted to the right. 

Summary. A simple online random-walk algorithm, with appropriately chosen 
parameters, can perform quite close to the optimal algorithm for cuckoo hashing 
with paging, even in settings where deletions occur. 

With parameters $(c, s) = (0.95, 10^3)$ and $(\alpha, b) = (0.97, 30)$ the expected
number of page requests E(X) for a successful search is about 1.044, using the
values from Table 3. With the Bloom filter approach described in Sect. 4.1 (which 
can be done only after finishing the insertion of all keys), the distribution from 
Fig. 5 (a) gives an expected number of page requests for an unsuccessful search 
of less than 1.0043. Both values are only slightly higher than those resulting from 
an optimal solution. One can instead use counting Bloom filters [4] to improve 
performance for unsuccessful searches with online insertions and deletions, at 
the cost of more space. 
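For readers unfamiliar with counting Bloom filters, the following is a minimal generic sketch of the data structure. It only illustrates why deletions become possible when bits are replaced by counters; the class name, the salted SHA-256 hashing and the parameter names are assumptions of ours, and the way such a filter is attached to pages follows Sect. 4.1 and is not reproduced here.

```python
import hashlib

class CountingBloomFilter:
    """Bloom filter with counters instead of bits, so that keys can be removed."""

    def __init__(self, num_counters, num_hashes):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _positions(self, key):
        # Derive num_hashes counter positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def add(self, key):
        for pos in self._positions(key):
            self.counters[pos] += 1

    def remove(self, key):
        # Only valid for keys that were added before; otherwise counters underflow.
        for pos in self._positions(key):
            self.counters[pos] -= 1

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all(self.counters[pos] > 0 for pos in self._positions(key))

f = CountingBloomFilter(num_counters=64, num_hashes=3)
f.add("apple")
f.add("pear")
f.remove("apple")
print(f.might_contain("pear"))  # True: added keys are never reported absent
```

The extra space mentioned above comes from storing a small counter rather than a single bit at each position.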

4.3 Small Pages 

We have seen that if one uses one backup option then the page size has only 
marginal influence on the existence of a legal orientation of G but heavily influ- 
ences the maximum fraction of primary keys (in the dynamic case as well as in 
the static case). Tables 2, 3 and 4 show that the smaller the page the smaller the 
fraction of primary keys, with a significant decrease from page size 100 to 10. In 
order to attenuate this downside one can use the following variant. Let $k_p = 1$
and $k_b = 1$. We use the idea of blocked cuckoo hashing [5,7,9] where each table
cell gets capacity $\ell$ for some constant $\ell$. One can think of pages of size
exactly one cell. Some reference (theoretical) thresholds $c^*_{2,\ell}$ for the
existence of a legal orientation of the corresponding cuckoo graphs are given in
Table 5 [5,9], and experimental threshold values are given in Figure 6. (Note that
$m$ is the number of table cells of capacity $\ell$ and we refer to the normalized
values $c^*_{2,\ell}/\ell$.)

0.897012  0.959154  0.980370  0.989551  0.997853  0.999143  0.999928

Table 5. Theoretical threshold values $c^*_{2,\ell}/\ell$.
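The variant with $k_p = 1$, $k_b = 1$ and cells of capacity $\ell$ can be illustrated with a small table in which every key has one primary and one backup cell. The sketch below is a simplification under our own assumptions: the class and method names are hypothetical, the two hash functions are simulated with salted calls to Python's built-in `hash`, and insertion uses a plain greedy placement with random eviction rather than the algorithms studied in this paper.

```python
import random

class BlockedTwoChoiceTable:
    """Each key has one primary and one backup cell; a cell holds up to `capacity` keys."""

    def __init__(self, num_cells, capacity, max_steps=500, seed=0):
        self.cells = [[] for _ in range(num_cells)]
        self.capacity = capacity
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        # Two random salts stand in for two independent hash functions.
        self.salts = (self.rng.getrandbits(64), self.rng.getrandbits(64))

    def _cell(self, key, which):
        return hash((self.salts[which], key)) % len(self.cells)

    def insert(self, key):
        """Return None on success, or the key left without a cell after max_steps."""
        for _ in range(self.max_steps):
            choices = (self._cell(key, 0), self._cell(key, 1))
            for c in choices:  # the primary cell is tried first
                if len(self.cells[c]) < self.capacity:
                    self.cells[c].append(key)
                    return None
            # Both cells are full: evict a random occupant and re-insert it.
            c = self.rng.choice(choices)
            i = self.rng.randrange(self.capacity)
            key, self.cells[c][i] = self.cells[c][i], key
        return key

    def cell_requests(self, key):
        """Cells read by a successful search (1 or 2), or None if the key is absent."""
        if key in self.cells[self._cell(key, 0)]:
            return 1
        if key in self.cells[self._cell(key, 1)]:
            return 2
        return None

table = BlockedTwoChoiceTable(num_cells=250, capacity=4)  # 1000 slots in total
homeless = [r for r in map(table.insert, range(750)) if r is not None]
requests = [table.cell_requests(k) for k in range(750) if k not in homeless]
print(sum(requests) / len(requests))  # mean cell requests per successful search
```

Because the greedy rule does not try to maximize the number of primary keys, the mean printed here should be read as a baseline for such a table and not as a value comparable to the optimal results.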



Fig. 6. $m = 10^6/\ell$, $k_p = 1$, $k_b = 1$, $a = 10^2$. Panels: (a) $\ell = 4$,
(b) $\ell = 16$. Each panel plots the measured data over $c/\ell$ together with a
sigmoid curve; the legend values are 0.00330075 and 0.980358 in (a), and 0.83104
and 0.999817 in (b).

Our aim remains to store as many keys as possible in their primary cell
while keeping the load $c$ near the threshold $c^*_{2,\ell}$. Table 6 gives optimal
(offline) results. They indicate that with respect to the ratio of primary keys
the variant $(k_p, k_b, s, \ell) = (1, 1, 1, 10)$ is slightly better than
$(k_p, k_b, s, \ell) = (3, 1, 10, 1)$. An advantage is that for $\ell > 3$ the
(known) thresholds $c^*_{2,\ell}/\ell$ are higher than the values $c^{+}_{s,m}$,
which are near $c^*_4$ ($c^*_4 \approx 0.976770$). For example, with $\ell = 16$
and load factor $c/\ell = 0.99$ a fraction 0.904 of the keys can be stored in their
primary cell, thus reducing the expected number of cell requests for successful
searches from 1.5, when each key is equally likely to be in either location, to
less than 1.1.
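The two figures quoted above follow from a one-line expectation, assuming that a successful search reads the primary cell first and the backup cell only if the key is not found there. The function name below is our own.

```python
def expected_cell_requests(primary_fraction):
    """Expected cells read by a successful search that checks the primary cell first."""
    return 1 * primary_fraction + 2 * (1 - primary_fraction)

print(expected_cell_requests(0.5))    # 1.5: key equally likely to be in either cell
print(expected_cell_requests(0.904))  # about 1.096, i.e. below 1.1
```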



5 Conclusion 



Our results suggest that cuckoo hashing with paging may prove useful in a 
number of settings where the cost of multiple lookups might otherwise prove 
prohibitive. Perhaps the most interesting aspect for continuing work is to obtain 
provable performance bounds for cuckoo hashing with pages. Even in the case 
of offline key distribution with one additional choice on a second page we do not 
have a formal result proving the threshold behavior we see in experiments.

$c/\ell$
0.90  0.822251  0.740026  0.898282  0.808454  0.913798  0.822418  0.940022  0.846020
0.91  0.815652  0.742244  0.894259  0.813776  0.910014  0.828113  0.936509  0.852223
0.92  0.808867  0.744157  0.890040  0.818837  0.906196  0.833700  0.932937  0.858302
0.93  0.801424  0.745325  0.885679  0.823682  0.902177  0.839024  0.929188  0.864145
0.94  0.793452  0.745845  0.881098  0.828232  0.898052  0.844169  0.925333  0.869813
0.95  0.784526  0.745300  0.876222  0.832411  0.893687  0.849003  0.921360  0.875292
0.96  0.774254  0.743283  0.870778  0.835947  0.889150  0.853584  0.917347  0.880653
0.97  0.761745  0.738893  0.864615  0.838676  0.884317  0.857787  0.913244  0.885846
0.98  0.743799  0.728923  0.857017  0.839876  0.878957  0.861378  0.908929  0.890751
0.99  no solution         0.847632  0.839156  0.870738  0.862030  0.904474  0.895429

Table 6. Maximum fraction of primary keys among 100 random graphs, for different
block sizes $\ell$, $m = 10^6/\ell$. For $c/\ell = 0.98$, $\ell = 4$ the failure
rate is $\lambda = 0.01$ and for $c/\ell = 0.99$, $\ell = 4$ we have $\lambda = 1$;
otherwise $\lambda = 0$.